1 CTCGAGAAAT CATAAAAAAT TTATTTGCTT TGTGAGCGGA TAACAATTAT AATAGATTCA
61 ATTGTGAGCG GATAACAATT TCACACAGAA TTCATTAAAG AGGAGAAATT AACTATGGCA
121 CTTAGTGGGA TCCGCATGCG AGCTCGGTAC CCCGGGGGTG GCAGCGGTTC TGGCGCAGCA
181 GCGGAAATCA GTGGTCACAT CGTACGTTCC CCGATGGTTG GTACTTTCTA CCGCACCCCA
241 AGCCCGGACG CAAAAGCGTT CATCGAAGTG GGTCAGAAAG TCAACGTGGG CGATACCCTG
301 TGCATCGTTG AAGCCATGAA AATGATGAAC CAGATCGAAG CGGACAAATC CGGTACCGTG
361 AAAGCAATTC TGGTCGAAAG TGGACAACCG GTAGAATTTG ACGAGCCGCT GGTCGTCATC
421 GAGGGTGGCA GCGGTTCTGG CCACCATCAC CATCACCATA AGCTTAATTA GCTGAGCTTG
481 GACTCCTGTT GATAGATCCA GTAATGACCT CAGAACTCCA TCTGGATTTG TTCAGAACGC
541 TCGGTTGCCG CCGGGCGTTT TTTATTGGTG AGAATCCAAG CTAGCTTGGC GAGATTTTCA
601 GGAGCTAAGG AAGCTAAAAT GGAGAAAAAA ATCACTGGAT ATACCACCGT TGATATATCC
661 CAATGGCATC GTAAAGAACA TTTTGAGGCA TTTCAGTCAG TTGCTCAATG TACCTATAAC
721 CAGACCGTTC AGCTGGATAT TACGGCCTTT TTAAAGACCG TAAAGAAAAA TAAGCACAAG
781 TTTTATCCGG CCTTTATTCA CATTCTTGCC CGCCTGATGA ATGCTCATCC GGAATTTCGT
841 ATGGCAATGA AAGACGGTGA GCTGGTGATA TGGGATAGTG TTCACCCTTG TTACACCGTT
901 TTCCATGAGC AAACTGAAAC GTTTTCATCG CTCTGGAGTG AATACCACGA CGATTTCCGG
961 CAGTTTCTAC ACATATATTC GCAAGATGTG GCGTGTTACG GTGAAAACCT GGCCTATTTC
1021 CCTAAAGGGT TTATTGAGAA TATGTTTTTC GTCTCAGCCA ATCCCTGGGT GAGTTTCACC
1081 AGTTTTGATT TAAACGTGGC CAATATGGAC AACTTCTTCG CCCCCGTTTT CACCATGGGC
1141 AAATATTATA CGCAAGGCGA CAAGGTGCTG ATGCCGCTGG CGATTCAGGT TCATCATGCC
1201 GTTTGTGATG GCTTCCATGT CGGCAGAATG CTTAATGAAT TACAACAGTA CTGCGATGAG
1261 TGGCAGGGCG GGGCGTAATT TTTTTAAGGC AGTTATTGGT GCCCTTAAAC GCCTGGGGTA
1321 ATGACTCTCT AGCTTGAGGC ATCAAATAAA ACGAAAGGCT CAGTCGAAAG ACTGGGCCTT
1381 TCGTTTTATC TGTTGTTTGT CGGTGAACGC TCTCCTGAGT AGGACAAATC CGCCCTCTAG
1441 ATTACGTGCA GTCGATGATA AGCTGTCAAA CATGAGAATT GTGCCTAATG AGTGAGCTAA
1501 CTTACATTAA TTGCGTTGCG CTCACTGCCC GCTTTCCAGT CGGGAAACCT GTCGTGCCAG
1561 CTGCATTAAT GAATCGGCCA ACGCGCGGGG AGAGGCGGTT TGCGTATTGG GCGCCAGGGT
1621 GGTTTTTCTT TTCACCAGTG AGACGGGCAA CAGCTGATTG CCCTTCACCG CCTGGCCCTG
1681 AGAGAGTTGC AGCAAGCGGT CCACGCTGGT TTGCCCCAGC AGGCGAAAAT CCTGTTTGAT
1741 GGTGGTTAAC GGCGGGATAT AACATGAGCT GTCTTCGGTA TCGTCGTATC CCACTACCGA
1801 GATATCCGCA CCAACGCGCA GCCCGGACTC GGTAATGGCG CGCATTGCGC CCAGCGCCAT
1861 CTGATCGTTG GCAACCAGCA TCGCAGTGGG AACGATGCCC TCATTCAGCA TTTGCATGGT
1921 TTGTTGAAAA CCGGAGATGG CACTCCAGTC GCCTTCCCGT TCCGCTATCG GCTGAATTTG
1981 ATTGCGAGTG AGATATTTAT GCCAGCCAGC CAGACGCAGA CGCGCCGAGA CAGAACTTAA
2041 TGGGCCCGCT AACAGCGCGA TTTGCTGGTG ACCCAATGCG ACCAGATGCT CCACGCCCAG
2101 TCGCGTACCG TCTTCATGGG AGAAAATAAT ACTGTTGATG GGTGTCTGGT CAGAGACATC
2161 AAGAAATAAC GCCGGAACAT TAGTGCAGGC AGCTTCCACA GCAATGGCAT CCTGGTCATC
2221 CAGCGGATAG TTAATGATCA GCCCACTGAC GCGTTGCGCG AGAAGATTGT GCACCGCCGC
2281 TTTACAGGCT TCGACGCCGC TTCGTTCTAC CATCGACACC ACCACGCTGG CACCCAGTTG
2341 ATCGGCGCGA GATTTAATCG CCGCGACAAT TTGCGACGGC GCGTGCAGGG CCAGACTGGA
2401 GGTGGCAACG CCAATCAGCA ACGACTGTTT GCCCGCCAGT TGTTGTGCCA CGCGGTTGGG
2461 AATGTAATTC AGCTCCGCCA TCGCCGCTTC CACTTTTTCC CGCGTTTTCG CAGAAACGTG
2521 GCTGGCCTGG TTCACCACGC GGGAAACGGT CTGATAAGAG ACACCGGCAT ACTCTGCGAC
2581 ATCGTATAAC GTTACTGGTT TCACATTCAC CACCCTGAAT TGACTCTCTT CCGGGCGCTA
2641 TCATGCCATA CCGCGAAAGG TTTTGCACCA TTCGATGGTG TCGGAATTTC GGGCAGCGTT
2701 GGGTCCTGGC CACGGGTGCG CATGATCTAG AGCTGCCTCG CGCGTTTCGG TGATGACGGT
2761 GAAAACCTCT GACACATGCA GCTCCCGGAG ACGGTCACAG CTTGTCTGTA AGCGGATGCC
2821 GGGAGCAGAC AAGCCCGTCA GGGCGCGTCA GCGGGTGTTG GCGGGTGTCG GGGCGCAGCC
2881 ATGACCCAGT CACGTAGCGA TAGCGGAGTG TATACTGGCT TAACTATGCG GCATCAGAGC
2941 AGATTGTACT GAGAGTGCAC CATATGCGGT GTGAAATACC GCACAGATGG GTAAGGAGAA
3001 AATACCGCAT CAGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC
3061 GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT ACGGTTATCC ACAGAATCAG
3121 GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA AAAGGCCAGG AACCGTAAAA
3181 AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC
3241 GACGCTCAAG TCAGAGGTGG CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC
3301 CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG
3361 CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG TATCTCAGTT
3421 CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA ACCCCCCGTT CAGCCCGACC
3481 GCTGCGCCTT ATCCGGTAAC TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC
3541 CACTGGCAGC AGCCACTGGT AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG
3601 AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG
3661 CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC GGCAAACAAA
3721 CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA GATTACGCGC AGAAAAAAAG
3781 GATCTCAAGA AGATCCTTTG ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG AACGAAAACT
3841 CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA
3901 ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT
3961 ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT TCATCCATAG
4021 TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA GGGCTTACCA TCTGGCCCCA
4081 GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC
4141 AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT
4201 CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG
4261 TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG GCTTCATTCA
4321 GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC CATGTTGTGC AAAAAAGCGG
4381 TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA
4441 TGGTTATGGC AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG
4501 TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT
4561 CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA AAAGTGCTCA
4621 TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT CTTACCGCTG TTGAGATCCA
4681 GTTCGATGTA ACCCACTCGT GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG
4741 TTTCTGGGTG AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC
4801 GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT
4861 ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA ATAGGGGTTC
4921 CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA AACCATTATT ATCATGACAT
4981 TAACCTATAA AAATAGGCGT ATCACGAGGC CCTTTCGTCT TCAC



Figure 1B



Dra III Sph I Sma I

115 ATGGCA CTTAGTGGGA TCCGCATGCG AGCTCGGTAC CCCGGGGGTG GCAGC 
TACCGT GAATCACCCT AGGCGTACGC TCGAGCCATG GGGCCCCCAC CGTCG 



Figure 1C 
Figure 2A
1 CAGGTGGCAC TTTTCGGGGA AATGTGCGCG GAACCCCTAT TTGTTTATTT TTCTAAATAC
61 ATTCAAATAT GTATCCGCTC ATGAGACAAT AACCCTGATA AATGCTTCAA TAATATTGAA
121 AAAGGAAGAG TATGAGTATT CAACATTTCC GTGTCGCCCT TATTCCCTTT TTTGCGGCAT
181 TTTGCCTTCC TGTTTTTGCT CACCCAGAAA CGCTGGTGAA AGTAAAAGAT GCTGAAGATC
241 AGTTGGGTGC ACGAGTGGGT TACATCGAAC TGGATCTCAA CAGCGGTAAG ATCCTTGAGA
301 GTTTTCGCCC CGAAGAACGT TTTCCAATGA TGAGCACTTT TAAAGTTCTG CTATGTGGCG
361 CGGTATTATC CCGTATTGAC GCCGGGCAAG AGCAACTCGG TCGCCGCATA CACTATTCTC
421 AGAATGACTT GGTTGAGTAC TCACCAGTCA CAGAAAAGCA TCTTACGGAT GGCATGACAG
481 TAAGAGAATT ATGCAGTGCT GCCATAACCA TGAGTGATAA CACTGCGGCC AACTTACTTC
541 TGACAACGAT CGGAGGACCG AAGGAGCTAA CCGCTTTTTT GCACAACATG GGGGATCATG
601 TAACTCGCCT TGATCGTTGG GAACCGGAGC TGAATGAAGC CATACCAAAC GACGAGCGTG
661 ACACCACGAT GCCTGTAGCA ATGGCAACAA CGTTGCGCAA ACTATTAACT GGCGAACTAC
721 TTACTCTAGC TTCCCGGCAA CAATTAATAG ACTGGATGGA GGCGGATAAA GTTGCAGGAC
781 CACTTCTGCG CTCGGCCCTT CCGGCTGGCT GGTTTATTGC TGATAAATCT GGAGCCGGTG
841 AGCGTGGGTC TCGCGGTATC ATTGCAGCAC TGGGGCCAGA TGGTAAGCCC TCCCGTATCG
901 TAGTTATCTA CACGACGGGG AGTCAGGCAA CTATGGATGA ACGAAATAGA CAGATCGCTG
961 AGATAGGTGC CTCACTGATT AAGCATTGGT AACTGTCAGA CCAAGTTTAC TCATATATAC
1021 TTTAGATTGA TTTAAAACTT CATTTTTAAT TTAAAAGGAT CTAGGTGAAG ATCCTTTTTG
1081 ATAATCTCAT GACCAAAATC CCTTAACGTG AGTTTTCGTT CCACTGAGCG TCAGACCCCG
1141 TAGAAAAGAT CAAAGGATCT TCTTGAGATC CTTTTTTTCT GCGCGTAATC TGCTGCTTGC
1201 AAACAAAAAA ACCACCGCTA CCAGCGGTGG TTTGTTTGCC GGATCAAGAG [illegible]
1261 TTTTTCCGAA GGTAACTGGC TTCAGCAGAG CGCAGATACC AAATACTGTC [illegible]
1321 AGCCGTAGTT AGGCCACCAC TTCAAGAACT CTGTAGCACC GCCTACATAC CTCGCTPTGC'
1381 TAATCCTGTT ACCAGTGGCT GCTGCCAGTG GCGATAAGTC GTGTCTTACC [illegible]
1441 CAAGACGATA GTTACCGGAT AAGGCGCAGC GGTCGGGCTG AACGGGGGGT TGGTGCArAC
1501 AGCCCAGCTT GGAGCGAACG ACCTACACCG AACTGAGATA CCTACAGCGT [illegible]
1561 AAAGCGCCAC GCTTCCCGAA GGGAGAAAGG CGGACAGGTA TCCGGTAAGC [illegible]
1621 GAACAGGAGA GCGCACGAGG GAGCTTCCAG GGGGAAACGC CTGGTATCTT [illegible]
1681 TCGGGTTTCG CCACCTCTGA CTTGAGCGTC GATTTTTGTG ATGCTCGTCA [illegible]
1741 GCCTATGGAA AAACGCCAGC AACGCGGCCT TTTTACGGTT CCTGGCCTTT TGCTGGOCTT
1801 TTGCTCACAT GTTCTTTCCT GCGTTATCCC CTGATTCTGT GGATAACCGT [illegible]
1861 TTGAGTGAGC TGATACCGCT CGCCGCAGCC GAACGACCGA GCGCAGCGAG TPARTPSARPfJ
1921 AGGAAGCCCA GGACCCAACG CTGCCCGAAA TTCCGACACC ATCGAATGGT GPAAAAPPTT
1981 TCGCGGTATG GCATGATAGC GCCCGGAAGA GAGTCAATTC AGGGTGGTGA ATGTGAAACC
2041 [illegible] [illegible] CAGAGTATGC CGGTGTCTCT TATCAGACCG TTTCCCGCGT
2101 GGTGAACCAG GCCAGCCACG TTTCTGCGAA AACGCGGGAA AAAGTGGAAG CGGCGATGGC
2161 GGAGCTGAAT TACATTCCCA ACCGCGTGGC ACAACAACTG GCGGGCAAAC AGTCGTTGCT
2221 GATTGGCGTT GCCACCTCCA GTCTGGCCCT GCACGCGCCG TCGCAAATTG TCGCGGCGAT
2281 TAAATCTCGC GCCGATCAAC TGGGTGCCAG CGTGGTGGTG TCGATGGTAG AACGAAGCGG
2341 CGTCGAAGCC TGTAAAGCGG CGGTGCACAA TCTTCTCGCG CAACGCGTCA GTGGGCTGAT
2401 CATTAACTAT CCGCTGGATG ACCAGGATGC CATTGCTGTG GAAGCTGCCT GCACTAATGT
2461 TCCGGCGTTA TTTCTTGATG TCTCTGACCA GACACCCATC AACAGTATTA TTTTCTCCCA
2521 TGAAGACGGT ACGCGACTGG GCGTGGAGCA TCTGGTCGCA TTGGGTCACC AGCAAATCGC
2581 GCTGTTAGCG GGCCCATTAA GTTCTGTCTC GGCGCGTCTG CGTCTGGCTG GCTGGCATAA
2641 ATATCTCACT CGCAATCAAA TTCAGCCGAT AGCGGAACGG GAAGGCGACT GGAGTGCCAT
2701 GTCCGGTTTT CAACAAACCA TGCAAATGCT GAATGAGGGC ATCGTTCCCA CTGCGATGCT
2761 GGTTGCCAAC GATCAGATGG CGCTGGGCGC AATGCGCGCC ATTACCGAGT CCGGGCTGCG
2821 CGTTGGTGCG GATATCTCGG TAGTGGGATA CGACGATACC GAAGACAGCT CATGTTATAT
2881 CCCGCCGTTA ACCACCATCA AACAGGATTT TCGCCTGCTG GGGCAAACCA GCGTGGACCG
2941 CTTGCTGCAA CTCTCTCAGG GCCAGGCGGT GAAGGGCAAT CAGCTGTTGC CCGTCTCACT
3001 GGTGAAAAGA AAAACCACCC TGGCGCCCAA TACGCAAACC GCCTCTCCCC GCGCGTTGGC
3061 CGATTCATTA ATGCAGCTGG CACGACAGGT TTCCCGACTG GAAAGCGGGC AGTGAGCGCA
3121 ACGCAATTAA TGTGAGTTAG CTCACTCATT AGGCACAATT CTCATGTTTG ACAGCTTATC
3181 ATCGACTGCA CGGTGCACCA ATGCTTCTGG CGTCAGGCAG CCATCGGAAG CTGTGGTATG
3241 GCTGTGCAGG TCGTAAATCA CTGCATAATT CGTGTCGCTC AAGGCGCACT CCCGTTCTGG
3301 ATAATGTTTT TTGCGCCGAC ATCATAACGG TTCTGGCAAA TATTCTGAAA TGAGCTGTTG
3361 ACAATTAATC ATCGGCTCGT ATAATGTGTG GAATTGTGAG CGGATAACAA TTTCACACAG
3421 GAAACACATA TGAACGACTT TCATCGCGAT ACGTGGGCGG AAGTGGATTT GGACGCCATT
3481 TACGACAATG TGGCGAATTT GCGCCGTTTG CTGCCGGACG ACACGCACAT TATGGCGGTC
3541 GTGAAGGCGA ACGCCTATGG ACATGGGGAT GTGCAGGTGG CAAGGACAGC GCTCGAAGCG
3601 GGGGCCTCCC GCCTGGCGGT TGCCTTTTTG GATGAGGCGC TCGCTTTAAG GGAAAAAGGA
3661 ATCGAAGCGC CGATTCTAGT TCTCGGGGCT TCCCGTCCAG CTGATGCGGC GCTGGCCGCC
3721 CAGCAGCGCA TTGCCCTGAC CGTGTTCCGC TCCGACTGGT TGGAAGAAGC GTCCGCCCTT
3781 TACAGCGGCC CTATTCCTAT TCATTTCCAT TTGAAAATGG ACACCGGCAT GGGACGGCTT
3841 GGAGTGAAAG ACGAGGAGGA GACGAAACGA ATCGCAGCGC TGATTGAGCG CCATCCGCAT
3901 TTTGTGCTTG AAGGGGCGTA CACGCATTTT GCGACTGCGG ATGAGGTGAA CACCGATTAT
3961 TTTTCCTATC AGTATACCCG TTTTTTGCAC ATGCTCGAAT GGCTGCCGTC GCGCCCGCCG
4021 CTCGTCCATT GCGCCAACAG CGCAGCGTCG CTCCGTTTCC CTGACCGGAC GTTPAATATG
4081 GTCCGCTTCG GCATTGCCAT GTATGGGCTT GCCCCGTCGC CCGGCATCAA GCCGCTGPTG
4141 CCGTATCCAT TAAAAGAAGC ATTTTCGCTC CATAGCCGCC TCGTACACGT PAAAAAAPTG
4201 CAACCAGGCG AAAAGGTGAG CTATGGTGCG ACGTACACTG CGCAGACGGA GGAGTGGA'TP
4261 GGGACGATTC CGATCGGCTA TGCGGACGGC TGGCTCCGCC GCCTGCAGCA PTTTPATf^TP
4321 CTTGTTGACG GACAAAAGGC GCCGATTGTC GGCCGCATTT GCATGGACCA GTGCATGATP
4381 CGCCTGCCTG GGCCGCTGCC GGTCGGCACG AAGGTGACAC TGATTGGTPf? [illegible]
4441 [illegible] CCATTGATGA TGTCGCTCGC CATTTGGAAA CGATCAACTA CGAAGTGCCT
4501 TGCACGATCA GCTATCGAGT GCCCCGTATT TTTTTCCGCC ATAAGCGTAT AATGGAAGTG
4561 AGAAACGCCA TTGGCCGCGG GGAAAGCAGT GCACATCACC ATCACCATCA CTAAAAGCTT
4621 GGATCCGAAT TCAGCCCGCC TAATGAGCGG GCTTTTTTTT GAACAAAATT AGCTTGGCTG
4681 TTTTGGCGGA TGAGAGAAGA

Figure 2B 
1 ATGGCTCTCA TCCCAGACTT GGCCATGGAA ACCTGGCTTC TCCTGGCTGT CAGCCTGGTG
61 CTCCTCTATC TATATGGAAC CCATTCACAT GGACTTTTTA AGAAGCTTGG AATTCCAGGG
121 CCCACACCTC TGCCTTTTTT GGGAAATATT TTGTCCTACC ATAAGGGCTT TTGTATGTTT
181 GACATGGAAT GTCATAAAAA GTATGGAAAA GTGTGGGGCT TTTATGATGG TCAACAGCCT
241 GTGCTGGCTA TCACAGATCC TGACATGATC AAAACAGTGC TAGTGAAAGA ATGTTATTCT
301 GTCTTCACAA ACCGGAGGCC TTTTGGTCCA GTGGGATTTA TGAAAAGTGC CATCTCTATA
361 GCTGAGGATG AAGAATGGAA GAGATTACGA TCATTGCTGT CTCCAACCTT CACCAGTGGA
421 AAACTCAAGG AGATGGTCCC TATCATTGCC CAGTATGGAG ATGTGTTGGT GAGAAATCTG
481 AGGCGGGAAG CAGAGACAGG CAAGCCTGTC ACCTTGAAAG ACGTCTTTGG GGCCTACAGC
541 ATGGATGTGA TCACTAGCAC ATCATTTGGA GTGAACATCG ACTCTCTCAA CAATCCACAA
601 GACCCCTTTG TGGAAAACAC CAAGAAGCTT TTAAGATTTG ATTTTTTGGA TCCATTCTTT
661 CTCTCAATAA CAGTCTTTCC ATTCCTCATC CCAATTCTTG AAGTATTAAA TATCTGTGTG
721 TTTCCAAGAG AAGTTACAAA TTTTTTAAGA AAATCTGTAA AAAGGATGAA AGAAAGTCGC
781 CTCGAAGATA CACAAAAGCA CCGAGTGGAT TTCCTTCAGC TGATGATTGA CTCTCAGAAT
841 TCAAAAGAAA CTGAGTCCCA CAAAGCTCTG TCCGATCTGG AGCTCGTGGC CCAATCAATT
901 ATCTTTATTT TTGCTGGCTA TGAAACCACG AGCAGTGTTC TCTCCTTCAT TATGTATGAA
961 CTGGCCACTC ACCCTGATGT CCAGCAGAAA CTGCAGGAGG AAATTGATGC AGTTTTACCC
1021 AATAAGGCAC CACCCACCTA TGATACTGTG CTACAGATGG AGTATCTTGA CATGGTGGTG
1081 AATGAAACGC TCAGATTATT CCCAATTGCT ATGAGACTTG AGAGGGTCTG CAAAAAAGAT
1141 GTTGAGATCA ATGGGATGTT CATTCCCAAA GGGGTGGTGG TGATGATTCC AAGCTATGCT
1201 CTTCACCGTG ACCCAAAGTA CTGGACAGAG CCTGAGAAGT TCCTCCCTGA AAGATTCAGC
1261 AAGAAGAACA AGGACAACAT AGATCCTTAC ATATACACAC CCTTTGGAAG TGGACCCAGA
1321 AACTGCATTG GCATGAGGTT TGCTCTCATG AACATGAAAC TTGCTCTAAT CAGAGTCCTT
1381 CAGAACTTCT CCTTCAAACC TTGTAAAGAA ACACAGATCC CCCTGAAATT AAGCTTAGGA
1441 GGACTTCTTC AACCAGAAAA ACCCGTTGTT CTAAAGGTTG AGTCAAGGGA TGGCACCGTA
1501 AGTGGAGCCT GA



Figure 3A 
1 MALIPDLAME TWLLLAVSLV LLYLYGTHSH GLFKKLGIPG PTPLPFLGNI LSYHKGFCMF
61 DMECHKKYGK VWGFYDGQQP VLAITDPDMI KTVLVKECYS VFTNRRPFGP VGFMKSAISI
121 AEDEEWKRLR SLLSPTFTSG KLKEMVPIIA QYGDVLVRNL RREAETGKPV TLKDVFGAYS
181 MDVITSTSFG VNIDSLKNPQ DPFVENTKKL LRFDFLDPFF LSITVFPFLI PILEVLNICV
241 FPREVTNFLR KSVKRMKESR LEDTQKHRVD FLQLMIDSQN SKETESHKAL SDLELVAQSI
301 IFIFAGYETT SSVLSFIMYE LATHPDVQQK LQEEIDAVLP NKAPPTYDTV LQMEYLDMVV
361 NETLRLFPIA MRLERVCKKD VEINGMFIPK GVVVMIPSYA LHRDPKYWTE PEKFLPERFS
421 KKNKDNIDPY IYTPFGSGPR NCIGMRFALM NMKLALIRVL QNFSFKPCKE TQIPLKLSLG
481 GLLQPEKPVV LKVESRDGTV SGA*

Figure 3B

1 ATGGATTCTC TTGTGGTCCT TGTGCTCTGT CTCTCATGTT TGCTTCTCCT TTCACTCTGG
61 AGACAGAGCT CTGGGAGAGG AAAACTCCCT CCTGGCCCCA CTCCTCTCCC AGTGATTGGA
121 AATATCCTAC AGATAGGTAT TAAGGACATC AGCAAATCCT TAACCAATCT CTCAAAGGTC
181 TATGGCCCGG TGTTCACTCT GTATTTTGGC CTGAAACCCA TAGTGGTGCT GCATGGATAT
241 GAAGCAGTGA AGGAAGCCCT GATTGATCTT GGAGAGGAGT TTTCTGGAAG AGGCATTTTC
301 CCACTGGCTG AAAGAGCTAA CAGAGGATTT GGAATTGTTT TCAGCAATGG AAAGAAATGG
361 AAGGAGATCC GGCGTTTCTC CCTCATGACG CTGCGGAATT TTGGGATGGG GAAGAGGAGC
421 ATTGAGGACC GTGTTCAAGA GGAAGCCCGC TGCCTTGTGG AGGAGTTGAG AAAAACCAAG
481 GCCTCACCCT GTGATCCCAC TTTCATCCTG GGCTGTGCTC CCTGCAATGT GATCTGCTCC
541 ATTATTTTCC ATAAACGTTT TGATTATAAA GATCAGCAAT TTCTTAACTT AATGGAAAAG
601 TTGAATGAAA ACATCAAGAT TTTGAGCAGC CCCTGGATCC AGATCTGCAA TAATTTTTCT
661 CCTATCATTG ATTACTTCCC GGGAACTCAC AACAAATTAC TTAAAAACGT TGCTTTTATG
721 AAAAGTTATA TTTTGGAAAA AGTAAAAGAA CACCAAGAAT CAATGGACAT GAACAACCCT
781 CAGGACTTTA TTGATTGCTT CCTGATGAAA ATGGAGAAGG AAAAGCACAA CCAACCATCT
841 GAATTTACTA TTGAAAGCTT GGAAAACACT GCAGTTGACT TGTTTGGAGC TGGGACAGAG
901 ACGACAAGCA CAACCCTGAG ATATGCTCTC CTTCTCCTGC TGAAGCACCC AGAGGTCACA
961 GCTAAAGTCC AGGAAGAGAT TGAACGTGTG ATTGGCAGAA ACCGGAGCCC CTGCATGCAA
1021 GACAGGAGCC ACATGCCCTA CACAGATGCT GTGGTGCACG AGGTCCAGAG ATACATTGAC
1081 CTTCTCCCCA CCAGCCTGCC CCATGCAGTG ACCTGTGACA TTAAATTCAG AAACTATCTC
1141 ATTCCCAAGG GCACAACCAT ATTAATTTCC CTGACTTCTG TGCTACATGA CAACAAAGAA
1201 TTTCCCAACC CAGAGATGTT TGACCCTCAT CACTTTCTGG ATGAAGGTGG CAATTTTAAG
1261 AAAAGTAAAT ACTTCATGCC TTTCTCAGCA GGAAAACGGA TTTGTGTGGG AGAAGCCCTG
1321 GCCGGCATGG AGCTGTTTTT ATTCCTGACC TCCATTTTAC AGAACTTTAA CCTGAAATCT
1381 CTGGTTGACC CAAAGAACCT TGACACCACT CCAGTTGTCA ATGGATTTGC CTCTGTGCCG
1441 CCCTTCTACC AGCTGTGCTT CATTCCTGTC TGAAGAAGAG CAGATGGCCT GGCTGCTGCT
1501 GTGCAGTCCC TGCAGCTCTC TTTCCTCTGG GGCATTATCC ATCTTTGCAC TATCTGTAAT
1561 GCCTTTTCTC ACCTGTCATC TCACATTTTC CCTTCCCTGA AGATCTAGTG AACATTCGAC
1621 CTCCATTACG GAGAGTTTCC TATGTTTCAC TGTGCAAATA TATCTGCTAT TCTCCATACT
1681 CTGTAACAGT TGCATTGACT GTCACATAAT GCTCATACTT ATCTAATGTA GAGTATTAAT
1741 ATGTTATTAT TAAATAGAGA AATATGATTT GTGTATTATA ATTCAAAGGC ATTTCTTTTC
1801 TGCATGATCT AAATAAAAAG CATTATTATT TGCTG



Figure 4A 



1 MDSLVVLVLC LSCLLLLSLW RQSSGRGKLP PGPTPLPVIG NILQIGIKDI SKSLTNLSKV
61 YGPVFTLYFG LKPIVVLHGY EAVKEALIDL GEEFSGRGIF PLAERANRGF GIVFSNGKKW
121 KEIRRFSLMT LRNFGMGKRS IEDRVQEEAR CLVEELRKTK ASPCDPTFIL GCAPCNVICS
181 IIFHKRFDYK DQQFLNLMEK LNENIKILSS PWIQICNNFS PIIDYFPGTH NKLLKNVAFM
241 KSYILEKVKE HQESMDMNNP QDFIDCFLMK MEKEKHNQPS EFTIESLENT AVDLFGAGTE
301 TTSTTLRYAL LLLLKHPEVT AKVQEEIERV IGRNRSPCMQ DRSHMPYTDA VVHEVQRYID
361 LLPTSLPHAV TCDIKFRNYL IPKGTTILIS LTSVLHDNKE FPNPEMFDPH HFLDEGGNFK
421 KSKYFMPFSA GKRICVGEAL AGMELFLFLT SILQNFNLKS LVDPKNLDTT PVVNGFASVP
481 PFYQLCFIPV *RRADGLAAA VQSLQLSFLW GIIHLCTICN AFSHLSSHIF PSLKI**TFD
541 LHYGEFPMFH CANISAILHT L*QLH*LSHN AHTYLM*SIN MLLLNREI*F VYYNSKAFLF
601 CMI*IKSIII C



Figure 4B 
1 ATGGGGCTAG AAGCACTGGT GCCCCTGGCC GTGATAGTGG CCATCTTCCT GCTCCTGGTG
61 GACCTGATGC ACCGGCGCCA ACGCTGGGCT GCACGCTACC CACCAGGCCC CCTGCCACTG 
121 CCCGGGCTGG GCAACCTGCT GCATGTGGAC TTCCAGAACA CACCATACTG CTTCGACCAG 
181 TTGCGGCGCC GCTTCGGGGA CGTGTTCAGC CTGCAGCTGG CCTGGACGCC GGTGGTCGTG 
241 CTCAATGGGC TGGCGGCCGT GCGCGAGGCG CTGGTGACCC ACGGCGAGGA CACCGCCGAC 
301 CGCCCGCCTG TGCCCATCAC CCAGATCCTG GGTTTCGGGC CGCGTTCCCA AGGGGTGTTC 
361 CTGGCGCGCT ATGGGCCCGC GTGGCGCGAG CAGAGGCGCT TCTCCGTGTC CACCTTGCGC 
421 AACTTGGGCC TGGGCAAGAA GTCGCTGGAG CAGTGGGTGA CCGAGGAGGC CGCCTGCCTT 
481 TGTGCCGCCT TCGCCAACCA CTCCGGACGC CCCTTTCGCC CCAACGGTCT CTTGGACAAA 
541 GCCGTGAGCA ACGTGATCGC CTCCCTCACC TGCGGGCGCC GCTTCGAGTA CGACGACCCT 
601 CGCTTCCTCA GGCTGCTGGA CCTAGCTCAG GAGGGACTGA AGGAGGAGTC GGGCTTTCTG 
661 CGCGAGGTGC TGAATGCTGT CCCCGTCCTC CTGCATATCC CAGCGCTGGC TGGCAAGGTC 
721 CTACGCTTCC AAAAGGCTTT CCTGACCCAG CTGGATGAGC TGCTAACTGA GCACAGGATG 
781 ACCTGGGACC CAGCCCAGCC CCCCCGAGAC CTGACTGAGG CCTTCCTGGC AGAGATGGAG 
841 AAGGCCAAGG GGAACCCTGA GAGCAGCTTC AATGATGAGA ACCTGCGCAT AGTGGTGGCT 
901 GACCTGTTCT CTGCCGGGAT GGTGACCACC TCGACCACGC TGGCCTGGGG CCTCCTGCTC 
961 ATGATCCTAC ATCCGGATGT GCAGCGCCGT GTCCAACAGG AGATCGACGA CGTGATAGGG 
1021 CAGGTGCGGC GACCAGAGAT GGGTGACCAG GCTCACATGC CCTACACCAC TGCCGTGATT 
1081 CATGAGGTGC AGCGCTTTGG GGACATCGTC CCCCTGGGTA TGACCCATAT GACATCCCGT
1141 GACATCGAAG TACAGGGCTT CCGCATCCCT AAGGGAACGA CACTCATCAC CAACCTGTCA 
1201 TCGGTGCTGA AGGATGAGGC CGTCTGGGAG AAGCCCTTCC GCTTCCACCC CGAACACTTC
1261 CTGGATGCCC AGGGCCACTT TGTGAAGCCG GAGGCCTTCC TGCCTTTCTC AGCAGGCCGC 
1321 CGTGCATGCC TCGGGGAGCC CCTGGCCCGC ATGGAGCTCT TCCTCTTCTT CACCTCCCTG 
1381 CTGCAGCACT TCAGCTTCTC GGTGCCCACT GGACAGCCCC GGCCCAGCCA CCATGGTGTC
1441 TTTGCTTTCC TGGTGAGCCC ATCCCCCTAT GAGCTTTGTG CTGTGCCCCG CTAG 



Figure 5A 



1 MGLEALVPLA VIVAIFLLLV DLMHRRQRWA ARYPPGPLPL PGLGNLLHVD FQNTPYCFDQ
61 LRRRFGDVFS LQLAWTPVVV LNGLAAVREA LVTHGEDTAD RPPVPITQIL GFGPRSQGVF
121 LARYGPAWRE QRRFSVSTLR NLGLGKKSLE QWVTEEAACL CAAFANHSGR PFRPNGLLDK
181 AVSNVIASLT CGRRFEYDDP RFLRLLDLAQ EGLKEESGFL REVLNAVPVL LHIPALAGKV
241 LRFQKAFLTQ LDELLTEHRM TWDPAQPPRD LTEAFLAEME KAKGNPESSF NDENLRIVVA
301 DLFSAGMVTT STTLAWGLLL MILHPDVQRR VQQEIDDVIG QVRRPEMGDQ AHMPYTTAVI
361 HEVQRFGDIV PLGMTHMTSR DIEVQGFRIP KGTTLITNLS SVLKDEAVWE KPFRFHPEHF
421 LDAQGHFVKP EAFLPFSAGR RACLGEPLAR MELFLFFTSL LQHFSFSVPT GQPRPSHHGV
481 FAFLVSPSPY ELCAVPR*



Figure 5B
Figure 7
Figure 8
Figure 9
Figure 10
Conversion of DBF to Fluorescein by Tagged Immobilised P450 3A4
Stability of Immobilised and soluble CYP2D6

[Plot: y-axis, activity remaining; x-axis, time (days); series: soluble, immobilised]

Figure 12






WO 2004/025244 



PCT/IB2003/005258 



[Plots: panels 3A4*WT, 3A4*2, 3A4*3, 3A4*4, 3A4*5, 3A4*15; x-axis in uM]

Figure 13